Table of Contents

COVSMA's first tool: COVSCO

COVSMA stands for Copernicus Satellites Versus Maladies. The current health crisis created the need for an online tool that monitors pollution levels, issues alerts intended to trigger automatic government measures (such as days without cars and trucks that are not 100% electric), tracks the live impact of the measures taken, and forecasts COVID19 risk up to 4 days ahead. COVID19 risk is defined as the predicted number of new hospitalizations due to severe COVID19 cases in each state/departement. By analogy, we have named this tool COVSCO (Copernicus Satellites Versus COVID19). We start with France and its 96 departements. A follow-up will apply the same methodology to severe respiratory diseases and expand the model and databases to a worldwide scale.

pf_1617544710.jpg

Data Exploration

The data

The features: 22 variables from which we will predict our target

The Target: New hospitalizations due to severe COVID19 cases

The daily number of new hospitalizations due to severe COVID19 cases for every French departement is what we will predict.

Mean new hospitalizations across all departements as a function of 7davg pollutant concentration differentials

Mean new hospitalizations across all departements as a function of 1MMax pollutant concentration differentials

Ozone (O3) and the number of severe COVID19 cases leading to hospitalization

Departement 75: Paris region Ile de France

Departement 83: Var region PACA

Nitrogen dioxide (NO2) and the number of severe COVID19 cases leading to hospitalization

Departement 75: Paris region Ile de France

Departement 83: Var region PACA

PM2.5 and the number of severe COVID19 cases leading to hospitalization

Departement 75: Paris region Ile de France

Departement 83: Var region PACA

CO and the number of severe COVID19 cases leading to hospitalization

Departement 75: Paris region Ile de France

The most polluted departements of France

The Ozone O3 Pollutant

The PM2.5 Pollutant

The NO2 Pollutant

The CO Pollutant

The PM10 Pollutant

The Facebook mobility index

The Facebook mobility index vs Ozone O3

The Facebook mobility index vs Carbon Monoxide CO

The Facebook mobility index vs Nitrogen Dioxide NO2

The Facebook mobility index vs PM2.5

The Facebook mobility index vs PM10

Training the model - Gradient Boosting for regression

Gradient Boosting for regression.

GB builds an additive model in a forward stage-wise fashion; it allows for the optimization of arbitrary differentiable loss functions. In each stage a regression tree is fit on the negative gradient of the given loss function.
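As a minimal sketch of the stage-wise fitting described above, the following uses scikit-learn's GradientBoostingRegressor on synthetic data. The data, the 22-column shape, and the hyperparameters are illustrative assumptions, not the project's actual dataset or settings.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-in for the real data (assumption): 22 features, one target.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 22))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=500)

# Illustrative hyperparameters; the default loss is the squared error.
gbr = GradientBoostingRegressor(
    n_estimators=200,
    learning_rate=0.1,
    max_depth=3,
    random_state=0,
)
gbr.fit(X, y)

# staged_predict yields the ensemble prediction after each boosting stage,
# which makes the additive, stage-by-stage improvement visible.
train_mse_per_stage = [
    float(np.mean((y - stage_pred) ** 2)) for stage_pred in gbr.staged_predict(X)
]
print("training MSE after first stage:", train_mse_per_stage[0])
print("training MSE after last stage:", train_mse_per_stage[-1])
```

Each entry of train_mse_per_stage corresponds to one added regression tree, so plotting the list shows how the training error shrinks as stages accumulate.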

Stack of estimators with a final regressor.

Stacked generalization consists of stacking the outputs of the individual estimators and using a regressor to compute the final prediction. Stacking makes it possible to use the strength of each individual estimator by feeding their outputs as input to a final estimator.

Note that estimators_ are fitted on the full X while final_estimator_ is trained using cross-validated predictions of the base estimators using cross_val_predict.
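The stacking setup described above can be sketched with scikit-learn's StackingRegressor. The choice of base estimators (gradient boosting and a random forest), the RidgeCV final regressor, and the synthetic data are assumptions made for illustration; the source does not state which estimators the project stacks.

```python
import numpy as np
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    StackingRegressor,
)
from sklearn.linear_model import RidgeCV

# Synthetic stand-in for the real data (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 22))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=300)

stack = StackingRegressor(
    estimators=[
        ("gbr", GradientBoostingRegressor(n_estimators=50, random_state=0)),
        ("rf", RandomForestRegressor(n_estimators=50, random_state=0)),
    ],
    final_estimator=RidgeCV(),  # hypothetical choice of final regressor
    cv=5,  # folds used internally to build the final estimator's training set
)
stack.fit(X, y)

# transform returns one column of base-estimator predictions per estimator;
# these columns are the inputs seen by the final estimator.
stacked_features = stack.transform(X[:5])
predictions = stack.predict(X[:5])
print(stacked_features.shape, predictions.shape)
```

After fitting, stack.estimators_ holds the base models refitted on the full X, and stack.final_estimator_ holds the regressor trained on their cross-validated predictions.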

Hold-out and Cross Validation (MSE/MAE)
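A hold-out evaluation and a cross-validation reporting MSE and MAE could look like the sketch below. The 80/20 split, the 5 folds, the model settings, and the synthetic data are assumptions, not the project's actual configuration.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import cross_validate, train_test_split

# Synthetic stand-in for the real data (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 22))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=400)

# Hold-out: fit on 80% of the rows, score on the remaining 20%.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = GradientBoostingRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
holdout_mse = mean_squared_error(y_test, y_pred)
holdout_mae = mean_absolute_error(y_test, y_pred)

# Cross-validation: scikit-learn scorers are "higher is better",
# so the error metrics are exposed negated and flipped back here.
cv_results = cross_validate(
    GradientBoostingRegressor(n_estimators=100, random_state=0),
    X,
    y,
    cv=5,
    scoring=("neg_mean_squared_error", "neg_mean_absolute_error"),
)
cv_mse = -cv_results["test_neg_mean_squared_error"].mean()
cv_mae = -cv_results["test_neg_mean_absolute_error"].mean()
print(f"hold-out MSE={holdout_mse:.4f} MAE={holdout_mae:.4f}")
print(f"5-fold   MSE={cv_mse:.4f} MAE={cv_mae:.4f}")
```

For daily data, a time-aware split such as scikit-learn's TimeSeriesSplit may be a better fit than the shuffled split used in this sketch, since shuffling can place future days in the training set.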

Feature importance Report

FIRclass1.png

FIRclass2.png
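One way to produce a feature importance ranking like the one reported above is the impurity-based feature_importances_ attribute of a fitted gradient boosting model. This is only a sketch: the feature names are hypothetical placeholders and the data is synthetic, and the source does not say which importance measure the report uses.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-in for the real data (assumption): only the first two
# features actually drive the target.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 22))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=400)
feature_names = [f"feature_{i}" for i in range(22)]  # hypothetical names

model = GradientBoostingRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances, normalized to sum to 1.
importances = model.feature_importances_
ranking = sorted(zip(feature_names, importances), key=lambda p: p[1], reverse=True)
for name, score in ranking[:5]:
    print(f"{name}: {score:.3f}")
```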

Exporting the model to a joblib file
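Exporting a fitted model with joblib and loading it back can be sketched as follows. The file name model.joblib, the temporary directory, and the model itself are assumptions chosen for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

# Synthetic stand-in for the real data (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 22))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = GradientBoostingRegressor(n_estimators=50, random_state=0).fit(X, y)

# Hypothetical output path; a temporary directory keeps the sketch side-effect free.
model_path = os.path.join(tempfile.mkdtemp(), "model.joblib")
joblib.dump(model, model_path)

# Reload the model and check that it reproduces the original predictions.
restored = joblib.load(model_path)
same_predictions = bool(np.allclose(model.predict(X), restored.predict(X)))
print("restored model matches:", same_predictions)
```

A joblib file is a pickle underneath, so it should be loaded with a scikit-learn version compatible with the one used to save it, and only from trusted sources.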

Running the TPOT AutoML optimization algorithm

Recurrent Neural Network

Conclusion